RNA Sequence Analysis Using Covariance Models
Running title: Covariance models of RNA
Authors
Abstract
We describe a general approach to several RNA sequence analysis problems using probabilistic models that flexibly describe the secondary structure and primary sequence consensus of an RNA sequence family. We call these models "covariance models". A covariance model of tRNA sequences is an extremely sensitive and discriminative tool for searching for additional tRNAs and tRNA-related sequences in sequence databases. A model can be built automatically from an existing sequence alignment. We also describe an algorithm for learning a model, and hence a consensus secondary structure, from initially unaligned example sequences and no prior structural information. Models trained on unaligned tRNA examples correctly predict tRNA secondary structure and produce high-quality multiple alignments. The approach may be applied to any family of small RNA sequences.
Subject: Computational biology
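The abstract does not say how base-paired positions are recognized in an alignment. As an illustration only, the sketch below uses mutual information between two alignment columns, a standard covariation measure, to show why compensatory base changes stand out. The function name `column_mutual_information` and the four-sequence toy alignment are assumptions made for this example, not the authors' implementation or data.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information, in bits, between columns i and j of an alignment.

    `alignment` is a list of equal-length strings. This is an illustrative
    covariation measure, not the scoring scheme of any particular tool.
    """
    pairs = [(seq[i], seq[j]) for seq in alignment]
    n = len(pairs)
    pair_counts = Counter(pairs)
    i_counts = Counter(a for a, _ in pairs)
    j_counts = Counter(b for _, b in pairs)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = i_counts[a] / n
        p_b = j_counts[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 3 always form a Watson-Crick
# pair (G-C, C-G, A-U, U-A), while columns 1 and 2 are invariant.
toy_alignment = ["GAAC", "CAAG", "AAAU", "UAAA"]

mi_paired = column_mutual_information(toy_alignment, 0, 3)
mi_unpaired = column_mutual_information(toy_alignment, 0, 1)
print(f"columns 0,3: {mi_paired:.2f} bits; columns 0,1: {mi_unpaired:.2f} bits")
```

In this toy case the covarying columns carry the maximum 2 bits for a four-letter alphabet, while a variable column compared with an invariant one carries none. Primary sequence conservation alone would not reveal the pairing, since neither column 0 nor column 3 is conserved.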
Similar resources
RNA sequence analysis using covariance models.
We describe a general approach to several RNA sequence analysis problems using probabilistic models that flexibly describe the secondary structure and primary sequence consensus of an RNA sequence family. We call these models 'covariance models'. A covariance model of tRNA sequences is an extremely sensitive and discriminative tool for searching for additional tRNAs and tRNA-related sequences i...
Finding local RNA motifs using covariance models
We present DISCO, an algorithm to detect conserved motifs in sets of unaligned RNA sequences. Our algorithm uses covariance models (CM) to represent motifs. We introduce a novel approach to initialise a CM using pairwise and multiple sequence alignment. The CM is then iteratively refined. We tested our algorithm on 26 data sets derived from Rfam seed alignments of microRNA (miRNA) precursors an...
Thermodynamic matchers for the construction of the cuckoo RNA family
RNA family models describe classes of functionally related, non-coding RNAs based on sequence and structure conservation. The most important method for modeling RNA families is the use of covariance models, which are stochastic models that serve in the discovery of yet unknown, homologous RNAs. However, the performance of covariance models in finding remote homologs is poor for RNA families wit...
CMfinder - a covariance model based RNA motif finding algorithm
MOTIVATION: The recent discoveries of large numbers of non-coding RNAs and computational advances in genome-scale RNA search create a need for tools for automatic, high quality identification and characterization of conserved RNA motifs that can be readily used for database search. Previous tools fall short of this goal. RESULTS: CMfinder is a new tool to predict RNA motifs in unaligned sequenc...
CMCompare webserver: comparing RNA families via covariance models
A standard method for the identification of novel non-coding RNAs is homology search by covariance models. Covariance models are constructed for specific RNA families with common sequence and structure (e.g. transfer RNAs). Currently, there are models for 2208 families available from Rfam. Before being included in a database, a proposed family should be tested for specificity (finding only tr...